A Comparison Study of Parsers for Patent Machine Translation
Abstract
Machine translation of patent documents is of great practical importance. One of the key technologies for improving machine translation quality is the use of syntax. Selecting an appropriate parser for patent translation is difficult because the effects of each parser on patent translation are not clear. This paper provides a comparative evaluation of several state-of-the-art parsers for English, focusing on their effects on patent machine translation from English to Japanese. We measured how much each parser contributed to improving translation quality when it was used to obtain the syntax of input sentences. In addition, we examined the effects of a method that uses the parsed document-level context containing the input sentence to determine noun phrases (Onishi et al., 2011). We conducted experiments using the NTCIR-8 patent translation task dataset. Most of the parsers improved translation quality. When the method using document-level context was applied, all of the compared parsers improved translation quality.
Related Resources
An Empirical Comparison of Parsers in Constraining Reordering for E-J Patent Machine Translation
Machine translation of patent documents is very important from a practical point of view. One of the key technologies for improving machine translation quality is the utilization of syntax. It is difficult to select the appropriate parser for English to Japanese patent machine translation because the effects of each parser on patent translation are not clear. This paper provides an empirical co...
A Comparative Study of English-Persian Translation of Neural Google Translation
Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...
Improvements to Syntax-based Machine Translation using Ensemble Dependency Parsers
Dependency parsers are almost ubiquitously evaluated on their accuracy scores, but these scores say nothing of the complexity and usefulness of the resulting structures. The structures may have more complexity due to their coordination structure or attachment rules. As dependency parses are basic structures upon which other systems are built, it would seem more reasonable to judge these parsers ...
Parsers as language models for statistical machine translation
Most work in syntax-based machine translation has been in translation modeling, but there are many reasons why we may instead want to focus on the language model. We experiment with parsers as language models for machine translation in a simple translation model. This approach demands much more of the language models, allowing us to isolate their strengths and weaknesses. We find that unmodifie...
Comparison of SMT and NMT trained with large Patent Corpora: Japio at WAT2017
Japan Patent Information Organization (Japio) participates in patent subtasks (JPC-EJ/JE/CJ/KJ) with phrase-based statistical machine translation (SMT) and neural machine translation (NMT) systems which are trained with its own patent corpora in addition to the subtask corpora provided by organizers of WAT2017. In EJ and CJ subtasks, SMT and NMT systems whose sizes of training corpora are about...